ADR for updated data model #182
base: main
This proposes a richer data structure to help us model how the various parts of "a decision" exist within the service, both logically and conceptually.
Documents are the overarching item in the data structure, and are what most people will actually mean when they talk about a "judgment". Each document MUST be assigned a unique, non-semantic identifier by the Find Case Law service, and may have one or more other identifiers such as NCNs.
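As a rough illustration of the document description above, a minimal sketch might look like the following. The class names, the `scheme` field, and the choice of UUIDs for the non-semantic identifier are all assumptions for illustration, not part of the ADR; the NCN value is made up.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Identifier:
    scheme: str  # hypothetical label for the identifier type, e.g. "ncn"
    value: str


@dataclass
class Document:
    # Service-assigned, non-semantic identifier. Using uuid4 is an assumption.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Zero or more additional identifiers, such as NCNs.
    identifiers: List[Identifier] = field(default_factory=list)


# Made-up NCN value, for illustration only.
judgment = Document(identifiers=[Identifier("ncn", "[2099] EXAMPLE 1")])
```

The point of the sketch is that the service identifier carries no meaning of its own, while semantic identifiers hang off the document as a separate list.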
Where a relationship exists between two documents (e.g. "X is a press summary of Y"), this relationship would likely be stored bidirectionally, i.e. as both "is summarised by" and "is a summary of", to simplify retrieval.
Review comment:
I think it's worth being specific that a judgment and its press summary are different documents, and that we don't yet have a settled opinion on language.
Review comment:
What does the relationship between a revision and its document look like?
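To make the bidirectional storage of document relationships concrete, here is a small in-memory sketch. The `RelationshipStore` class, its method names, and the placeholder document ids are hypothetical; only the two relationship labels come from the ADR text, and the real storage mechanism is not specified there.

```python
from collections import defaultdict

# Each relationship label maps to its inverse (labels taken from the ADR text).
INVERSE = {
    "is a summary of": "is summarised by",
    "is summarised by": "is a summary of",
}


class RelationshipStore:
    def __init__(self):
        # document id -> set of (relationship, other document id)
        self._edges = defaultdict(set)

    def relate(self, subject_id, relationship, object_id):
        """Store the relationship and its inverse, so either side can be queried directly."""
        self._edges[subject_id].add((relationship, object_id))
        self._edges[object_id].add((INVERSE[relationship], subject_id))

    def related(self, document_id, relationship):
        """Return the ids of documents linked to document_id by the given relationship."""
        edges = self._edges.get(document_id, ())
        return sorted(other for rel, other in edges if rel == relationship)


store = RelationshipStore()
store.relate("press-summary-id", "is a summary of", "judgment-id")
```

With both directions written at insert time, `store.related("judgment-id", "is summarised by")` can be answered from the judgment's own entry without scanning every other document.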
A revision represents a distinct submission of a document to the National Archives, usually by a court or tribunal. For new submissions this will usually be via TDR, but some legacy ingestions may have been done via other means.
A revision SHOULD have a "source document" which we consider to be the canonical representation of the revision, and from which all other representations are derived. For new submissions this will usually be a .docx file, but it could be another type of file for legacy ingestions or future submissions. It is possible that legacy ingestions will no longer have the original file available for all past revisions (although the original will remain in The National Archives' preservation system).
Review comment:
Is the source document in practice a link to S3 and maybe a hash of that file?
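One possible shape for a revision and its source document, following the suggestion in the question above, is sketched below. Everything here is an assumption for illustration: the class and field names, the use of SHA-256, the back-reference from a revision to its document id, and the example bucket path. The ADR text itself only says a revision SHOULD have a source document.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceDocument:
    location: str  # assumption: a URI for the stored file, such as an S3 object URL
    sha256: str    # assumption: hex digest of the file contents


@dataclass(frozen=True)
class Revision:
    document_id: str    # assumption: each revision points back at its document
    submitted_via: str  # e.g. "TDR", or a legacy route
    # SHOULD be present, but may be missing for some legacy ingestions.
    source: Optional[SourceDocument] = None


def describe_source(location: str, content: bytes) -> SourceDocument:
    """Pair a file location with a hash of its contents."""
    return SourceDocument(location, hashlib.sha256(content).hexdigest())


source = describe_source("s3://example-bucket/example.docx", b"example bytes")
new_revision = Revision("doc-1", "TDR", source)
legacy_revision = Revision("doc-1", "legacy")
```

Keeping `source` optional reflects the SHOULD in the text: a legacy revision without its original file is still a valid revision.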
Our existing data model is starting to creak under the load, and needs a bit of a refresh. This ADR proposes a new structure for the data to better support future requirements.